PGRP2016 - Discovery and evaluation of inbred-specific and hybrid-specific regulatory modules

"Discovery and Evaluation of Inbred-Specific and Hybrid-Specific Regulatory Modules" is supported by National Science Foundation grant IOS 1546899, awarded to Briggs and Springer.

Steven P. Briggs (PI), UC San Diego, sbriggs@ucsd.edu

Nathan M. Springer (co-PI), University of Minnesota, springer@umn.edu

Background

Progeny from sexual hybridization in plants, animals, and microbes typically display hybrid vigor, defined as the difference in vigor between parents and their hybrid progeny. Hybrid vigor is so substantial that most crops and livestock are now bred for use as hybrids, and natural inter-specific hybrids are displacing parental species from their niches. Hybrid vigor can provide a substantial increase in fitness with no obvious trade-offs. Despite this, there are no interventions in nutrition, medicine, or agriculture that exploit this opportunity to increase vigor other than its use in conventional breeding for crops and livestock. The absence of interventions arises from a nearly complete lack of understanding despite the practical importance of the phenomenon and a long history of scientific study including a detailed description by Charles Darwin in 1876. Molecular phenotypes of hybrid vigor could help explain hybrid vigor phenotypes and serve as proxies for mechanistic studies.

Differential gene expression underlies the relationships between genotypes and phenotypes. Gene expression is complex and regulation occurs at every step. Because of this complex regulation, the correlations between mRNA levels and protein levels are poor, and mRNA measurements are unreliable indicators of their cognate protein levels (Walley 2013).

We have combined proteomics and transcriptomics with data analytics to reconstruct regulatory networks between genes (Walley 2016). The transcriptome and the proteome were found to be "telling the same story" but, surprisingly, they did so with different genes. For example, a genome-wide comparison of mRNA levels in 23 different tissues enabled a dendrogram of tissue relationships. The same dendrogram arose from proteome data and yet the informative genes (i.e., the principal components) were mostly different. At the gene level, the transcriptome cannot substitute for the proteome, whereas at the level of ontological enrichments they give similar results. Discrepancies between the transcriptome and the proteome reveal regulatory processes in gene expression that would not be discovered otherwise.

Epigenomic mechanisms can regulate gene expression. We used machine learning to integrate epigenomic marks, such as DNA methylation and histone modifications, with transcriptome and proteome data. This revealed a new kind of gene regulation that provides heritable permissions for expression (Sartor 2019). Surprisingly, different epigenomic patterns were found to be associated with mRNA expression compared to protein expression. It is unclear how these chromatin marks regulate protein expression. The epigenomic patterns revealed by machine learning provided a robust way to classify every gene as expressible or silent. These classifications provided gene sets of all protein-coding, expressible genes that contained substantially more real genes and fewer false genes than those resulting from expert curation of the genome. Accurate gene sets are critical for genome-wide association studies and omics analyses to understand hybrid vigor.

Progress

Transcriptome comparisons of 23 different tissues or developmental stages of a hybrid to its inbred parents have revealed that ∼30 000 genes are expressed in at least one tissue of one inbred and an additional ∼10 000 "silent" genes are not expressed in any tissue of any genotype; 90% of these silent genes are non-syntenic relative to other grasses (Zhou 2019). Approximately 74% of the expressed genes exhibit differential expression in at least one tissue. However, the majority of genes with differential expression do not exhibit consistent differential expression in different tissues. These genes often exhibit tissue-specific differential expression with equivalent expression in other tissues, and in many cases they switch the directionality of differential expression in different tissues. Therefore, the tendency to classify alleles as strong or weak may be misleading if only one or a few tissues are examined. Nearly 5000 genes are expressed in only one parent in at least one tissue (single parent expression) and 97% of these genes are expressed at mid-parent levels or higher in the hybrid. There is relatively little evidence for non-additive gene expression patterns that are maintained in multiple tissues. Our results suggest that complementation arising from differential gene expression may be tissue-specific and few genes are likely to be providing hybrid vigor in every tissue.
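The terms used above (single parent expression, mid-parent level, and a switch in the direction of differential expression between tissues) can be made concrete with a minimal sketch. This is an illustration only and is not the analysis pipeline of Zhou 2019: the function names, the expression threshold of 1.0, the 2-fold cutoff, and the example values are all hypothetical assumptions.

```python
# Illustrative only: names, thresholds, and values are hypothetical,
# not taken from the published analysis.

def mid_parent_value(parent_a, parent_b):
    """Average of the two parental expression values (the additive expectation)."""
    return (parent_a + parent_b) / 2.0


def is_single_parent_expression(parent_a, parent_b, threshold=1.0):
    """True if exactly one parent is expressed at or above the assumed threshold."""
    return (parent_a >= threshold) != (parent_b >= threshold)


def hybrid_at_or_above_mid_parent(parent_a, parent_b, hybrid):
    """True if the hybrid is expressed at the mid-parent level or higher."""
    return hybrid >= mid_parent_value(parent_a, parent_b)


def switches_direction(expression_by_tissue, min_fold=2.0):
    """True if parent A is higher in one tissue and parent B is higher in another.

    expression_by_tissue maps a tissue name to (parent_a, parent_b) values;
    min_fold is an assumed cutoff for calling a difference.
    """
    pairs = list(expression_by_tissue.values())
    a_higher = any(a > b and a >= min_fold * b for a, b in pairs)
    b_higher = any(b > a and b >= min_fold * a for a, b in pairs)
    return a_higher and b_higher


if __name__ == "__main__":
    # Hypothetical gene: expressed in parent A only, hybrid slightly above mid-parent.
    print(is_single_parent_expression(10.0, 0.0))
    print(hybrid_at_or_above_mid_parent(10.0, 0.0, 6.0))
    # Hypothetical gene whose stronger allele differs between two tissues.
    print(switches_direction({"root": (10.0, 2.0), "leaf": (1.0, 8.0)}))
```

A gene that looks like it carries a "strong" allele in one tissue can fail `switches_direction` only if every other tissue agrees, which is why sampling a single tissue can mislead.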

Precision proteomics has been applied to create a companion dataset with the same tissue powder samples as used for our transcriptomics (unpublished results). Each sample is labeled with a unique isotopic tag that enables samples to be quantitatively compared in the gas phase of the mass spectrometer with high precision. Clear patterns of expression have emerged that are not seen in the transcriptome data. Hundreds of genes are over-expressed in the hybrid at the protein level and not at the mRNA level. The differences in expression level are small and they could not have been observed with less precise methods. A small difference in cell division rate can have a large impact on the total number of cells over time if cell death rates are negligible. The differences we observe in expression are commensurate with the greater size and vigor of hybrids.
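The point about cell division rates is a statement about compounding, and a small calculation shows its scale. The sketch below assumes the simplest possible model (every cell keeps dividing at a constant rate and none die); the function names and the example rates are hypothetical and are not measurements from this project.

```python
# Illustrative compounding model: constant division rate, no cell death.
# The rates used in the example are hypothetical, not project data.

def cell_count(initial_cells, divisions_per_day, days):
    """Number of cells after `days` of doubling at a constant rate."""
    return initial_cells * 2 ** (divisions_per_day * days)


def fold_advantage(rate_a, rate_b, days):
    """Ratio of cell numbers for two lineages that start equal
    and differ only in division rate."""
    return 2 ** ((rate_a - rate_b) * days)


if __name__ == "__main__":
    # A 5% faster division rate (1.05 vs 1.00 divisions per day) over 30 days.
    print(round(fold_advantage(1.05, 1.00, 30), 2))
```

Under these assumptions the advantage grows exponentially with time, so a rate difference too small to notice over a single division can still produce a severalfold difference in cell number over weeks.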

Literature Cited

Walley JW, Shen Z, Sartor R, Wu KJ, Osborn J, Smith LG, Briggs SP. Reconstruction of protein networks from an atlas of maize seed proteotypes. Proc Natl Acad Sci U S A. 2013 Dec 3;110(49):E4808-17. doi: 10.1073/pnas.1319113110. PMID: 24248366

Walley JW, Sartor RC, Shen Z, Schmitz RJ, Wu KJ, Urich MA, Nery JR, Smith LG, Schnable JC, Ecker JR, Briggs SP. Integration of omic networks in a developmental atlas of maize. Science. 2016 Aug 19;353(6301):814-8. doi: 10.1126/science.aag1125. PMID: 27540173

Sartor RC, Noshay J, Springer NM, Briggs SP. Identification of the expressome by machine learning on omics data. Proc Natl Acad Sci U S A. 2019 Aug 16. pii: 201813645. doi: 10.1073/pnas.1813645116. [Epub ahead of print] PMID: 31420517

Zhou P, Hirsch CN, Briggs SP, Springer NM. Dynamic Patterns of Gene Expression Additivity and Regulatory Variation throughout Maize Development. Mol Plant. 2019 Mar 4;12(3):410-425. doi: 10.1016/j.molp.2018.12.015. Epub 2018 Dec 27. PMID: 30593858

6108 Natural Science Building, MC 0380
9500 Gilman Drive
La Jolla, CA 92093-0380
(858) 534-4979